3 Introduction


Increasing the enrollment of patients in clinical trials is important to making progress towards
finding more effective treatments for breast cancer. Accrual is complicated by the large number of
potential studies and by the cost and complexity of determining whether a patient meets the
necessary eligibility criteria. Under this proposal, we are developing a Web-based expert system
that can determine a patient's eligibility for clinical trials. The expert system is designed
to take into account the cost of the tests required to meet inclusion criteria and to acquire
information in the most cost-effective way possible.

Additionally, it is important to be able to easily add clinical trials to the system and remove
them from it. Trials are continually becoming available, going on suspension, or being closed to
accrual. Towards this end, we have developed a companion Web-based system that enables anyone to
enter the information required to describe the eligibility/ineligibility criteria for a clinical
trial. A newly entered trial/protocol can then be directly included in the clinical trial
assignment expert system with no expert intervention.


4  Body 

In this second year, we have refined the original prototype to produce version 1.3.2. We have
tested it with data from 187 retrospective patients, and we have extensively tested its ability to
order questions associated with tests to save dollar costs on 30 patients. We have further tested
it with 57 current patients and are continuing to test new patients as data become available. Table
1 summarizes our results on the current patients. Patients are only evaluated for trials that are
currently enrolling patients. The trial status can change when a trial is put on suspension,
closed, brought off suspension, or initiated. The system finds all matches that correspond to
trials in which patients have been enrolled, with one exception in which some data are missing.
Also, the 57 patients have been found eligible for 37 trials in which they were not enrolled.
Clearly, a set of patients who are eligible for clinical trials is not being enrolled for one or
more reasons (there are 28 patients in this class). As of this writing, 74 current patients have
been run through the system.

We  have  verified  that  the  system  correctly  finds  protocols  for  which  patients  are  eligible.  We 
are  investigating  the  cases  where  the  system  finds  patients  eligible  for  a  protocol  but  they  do  not 
go  on  the  protocol.  These  patients  fall  into  two  classes:  the  class  of  patients  put  on  a  different 
protocol  and  the  class  of  patients  not  put  on  any  protocol.  There  are  now  12  protocols  available 
in  the  system.  Some  of  these  are  closed.  At  the  present  time,  all  breast  cancer  protocols  at  the 
Moffitt Cancer Center which are accruing at least two patients a month are available through
our  system. 

It  is  important  to  be  able  to  add  new  trials/protocols  in  a  time  efficient  manner.  It  is  also 
important  that  the  process  be  such  that  it  is  straightforward  for  a  physician  or  nurse  or  medical 
worker  to  enter  the  information  from  the  eligibility/ineligibility  criteria.  Initially,  it  was  taking 
us  approximately  one  week  of  the  time  of  a  computer  science  expert  to  enter  a  new  protocol.  This 
year  we  have  developed  a  prototype  system  which  enables  a  user  to  enter  a  new  trial/protocol 
in  about  an  hour.  We  have  tested  it  with  novice  users  [1,  2]  and  found  that  they  learn  to  use 
the  system  quite  quickly.  It  is  our  conjecture  that  anyone  with  a  modicum  of  medical  knowledge 
and access to the eligibility/ineligibility criteria or inclusion/exclusion criteria can enter a new
protocol. 

Key  Research  Accomplishments: 

•  We  have  enhanced  our  prototype  system  to  very  stable  version  1.3.2.  We  have  added  cost 
functionality  and  tested  this  successfully. 

•  Utilizing  retrospective  patient  data  and  current  patient  data,  it  has  been  found  that  patients 
are  eligible  for  multiple  protocols/trials.  Further,  with  current  patient  data  we  find  patients 
eligible  for  trials  and  not  put  on  any  trial. 

• Extensive testing of cost functionality has been done. It has been shown that the average
cost of determining eligibility may be significantly reduced (by over 60%) when the cost
functionality portion of the expert system is utilized.

• An automated protocol acquisition tool, version 1.7, has been developed. It has been tested
with novice users and found to be quite usable. It is now how we add new protocols. A
new protocol takes about 1 hour to enter.

Reportable  Outcomes:  We  have  had  two  papers  [2,  3],  which  are  attached,  accepted  to  the 
2002  IEEE  International  Conference  on  Systems,  Man,  and  Cybernetics.  We  are  in  the  process 
of  revising  a  journal  submission  that  got  reasonably  positive  reviews.  A  web  prototype  of  the 
clinical  trial  assignment  system  is  available  at  http://morden.csee.usf.edu/moffit  with  password 
available  from  the  principal  investigator. 

5  Conclusions 

We  have  developed  a  scalable  prototype  which  currently  can  determine  eligibility  for  twelve  breast 
cancer  clinical  trials.  The  system  has  been  tested  using  retrospective  data  from  187  patients  who 
are  assigned  to  some  clinical  trial.  Its  accuracy  has  been  verified.  The  system  correctly  finds 
cases  in  which  a  patient  is  eligible  for  multiple  clinical  trials.  This  will  enable  a  physician  to  make 
the best choice from the available trials. The system is able to utilize monetary cost in requesting
tests  to  rule  in/rule  out  a  patient  from  the  set  of  available  clinical  trials.  The  default  ordering  of 
questions  allows  the  system  user  to  rapidly  determine  the  eligibility  or  ineligibility  of  a  patient 
for  any  subset  of  the  available  clinical  trials  entered  into  the  system.  We  have  been  able  to 
show  a  significant  average  cost  saving  (over  60%)  by  using  the  cost  feature  to  order  questions. 
Of  course,  there  is  no  guarantee  that  a  clinician  would  order  tests  as  suggested  by  the  question 
ordering  of  our  system.  However,  the  potential  for  cost  savings  is  significant. 

The  system  is  Web  based  and  password  protected.  It  provides  rapid  response  when  a  person 
enters  answers  to  one  or  more  questions  on  a  page  of  system  selected  questions.  It  can  be  used 
from  any  computer  on  the  World  Wide  Web.  Hence,  community  physicians  will  be  able  to 
determine  the  potential  eligibility  (they  may  not  wish  to  run  all  tests)  of  the  patient  for  clinical 
trials  at  cancer  centers  in  their  region. 

A prototype to enable physicians, nurses or technicians to enter new protocols has been
completed.  The  system  is  now  in  use.  It  reduces  the  time  required  to  add  a  new  trial  or  protocol 
to  approximately  1  hour.  It  enables  non-computer  scientists  to  add  trial/protocols  to  the  system. 
This  knowledge  acquisition  tool  has  been  designed  to  minimize/eliminate  the  cases  where  similar 
questions  acquiring  essentially  the  same  information  would  have  to  be  asked.  This  feature  has 
the  potential  to  cause  slight  changes  to  the  wording  of  inclusion/exclusion  criteria.  We  believe 
that this change is minor and will have no effect on IRB approval. However, this year we will
have new protocols entered using existing questions and go back to the IRB to discuss
any changes in criteria wording made to fit existing questions within the system. An example would
be a protocol in which one question asks if a test value is greater than some threshold and a
separate question asks if it is less than some threshold, versus a single question which asks if
the test value is in some range. We believe that such a change is trivial, but this must be
addressed in practice, and we will evaluate whether it could cause review board decisions to
change.

5.1  So  What 

The  prototype  system  shows  the  potential  for  allowing  community  physicians,  as  well  as  cancer 
center  physicians,  to  quickly  and  cost  effectively  determine  for  which  clinical  trials  a  patient  may 
be  eligible.  It  holds  the  promise  of  enabling  greater  patient  accrual  for  trials  by  increasing  the 
awareness of each trial for treating physicians throughout a region. In this next year, we will
be evaluating how many of the eligible patients who were not enrolled in clinical trials were
actually missed by clinical practitioners, as opposed to being excluded for a particular reason
(e.g., it was clear they would not agree) or being offered a trial and declining to enter it.

Table 1: Current patient data. Number of patients checked is 57. Number of currently accruing
trials is 7.

Clinical Trial   Same      New       Missing   Patients   Predicted
Number           Matches   Matches   Data      Checked    Eligibility
11132            4         1         1         7          5
11931            1         8         0         57         9
11971            3         0         0         56         3
12100            0         2         0         55         2
12101            4         21        0         55         25
12601            0         1         0         50         1
12775            1         4         0         24         5
Total            13        37        1         57         50


A  Cost-Effective  Agent  for  Clinical  Trial  Assignment 

Princeton K. Kokku, Lawrence O. Hall, Dmitry B. Goldgof, Eugene Fink, and Jeffrey P. Krischer

kokku@csee.usf.edu, hall@csee.usf.edu, goldgof@csee.usf.edu,
eugene@csee.usf.edu, jpkrischer@moffitt.usf.edu

Computer  Science  and  Engineering,  University  of  South  Florida,  Tampa,  Florida  33620 


Abstract — The purpose of a clinical trial is to evaluate
a new treatment procedure. When medical researchers
conduct a trial, they recruit participants
with appropriate medical histories. To select participants,
the researchers analyze medical records of
the available patients, which has traditionally been a
manual procedure. We describe an intelligent agent
that helps to select patients for clinical trials. If the
available data are insufficient for choosing patients,
the agent suggests additional medical tests and finds
an ordering of the tests that reduces their total cost.

Keywords — Medical expert systems, automated diagnosis,
clinical trials.

I.  Introduction 

A clinical trial is an experiment with a new treatment
procedure. When medical researchers test a new
treatment, they recruit patients with appropriate health
problems and medical histories. The selection of patients
has traditionally been a manual procedure, and
recent studies have shown that clinicians can miss up to
60% of the eligible patients [9, 10, 14, 26, 35, 38].

If the available records do not provide enough data,
clinicians perform medical tests as part of the selection
process. The costs of most tests have declined over the
last decade, but the number of tests has significantly
increased [33, 36], which is partially due to inappropriate
ordering of tests [1, 25]. Clinicians can reduce the cost
by first requiring inexpensive tests and then using their
results to avoid some expensive tests; however, finding
the right ordering may be a complex problem.

The  purpose  of  the  described  work  is  to  automate  the 
selection  of  patients  for  clinical  trials  and  minimize  the 
cost  of  related  tests.  We  have  developed  an  agent  that 
identifies  appropriate  trials  for  each  patient,  and  built  a 
knowledge  base  for  breast-cancer  trials. 

II.  Previous  Work 

Researchers began to work on medical expert systems
in the early seventies. Shortliffe et al. developed the
MYCIN system, which diagnosed bacterial diseases [5, 30,
31]. Its knowledge base consisted of if-then rules, which
allowed for the analysis of symptoms and evaluation of
the certainty of the diagnosis. Experiments showed that
MYCIN correctly diagnosed common diseases, which led
to the development of other medical systems [5, 19], such
as NEOMYCIN, PUFF, CENTAUR, and VM. Shortliffe et al.
created a system for selecting chemotherapy treatments,
called ONCOCIN [32], which also evolved from MYCIN.

Lucas et al. constructed a rule-based system for diagnosing
liver and biliary-tract diseases [16], but it often
gave an incorrect diagnosis [12, 23]. Korver and Lucas
converted the initial system into a Bayesian network,
which improved its performance [13, 15].

Musen et al. built a rule-based system, called EON,
that selected AIDS patients for clinical trials [20].
Ohno-Machado et al. developed the AIDS2 system, which also
assigned AIDS patients to clinical trials [21]. They
integrated logical rules with Bayesian networks, which
helped to make decisions in the absence of some data.

Bouaud et al. created a cancer expert system, called
ONCODOC, that suggested alternative clinical trials for
each patient and allowed a physician to choose among
them [3, 4]. Seroussi et al. used ONCODOC to select
participants for clinical trials at two hospitals, which
helped to increase the number of selected patients by a
factor of three [27, 28, 29].

Hammond and Sergot created the OaSiS architecture [11],
which combined the techniques from earlier
systems, including EON and ONCOCIN. Smith et al. built
a system that assisted a clinician in selecting medical
tests and reducing their number and cost [17, 18, 33].

Fallowfield et al. studied how physicians selected cancer
patients for clinical trials, and compared manual
and automatic selection [8]. They showed that expert
systems could improve the selection accuracy; however,
their study also revealed that physicians were reluctant
to use these systems. Carlson et al. conducted similar
studies with AIDS trials, and also concluded that expert
systems could lead to a more accurate selection [6].

Theocharous developed a Bayesian system that selected
clinical trials for cancer patients [24, 34]. It
learned conditional probabilities of medical-test outcomes
and evaluated the probability of a patient’s eligibility
for each trial. On the negative side, the available
medical records were often insufficient for learning
accurate probabilities. Furthermore, when adding a new
clinical trial, the user had to change the structure of the
underlying Bayesian network.

To address these problems, Bhanja et al. built a rule-based
system for the same task [2]. We have continued
that work, extended the system, and added a mechanism
for reducing costs involved in patient selection.

III.  Example 

We have developed an intelligent agent that helps to
select clinical trials for eligible patients. It prompts a
clinician to enter the results of medical tests, and identifies
appropriate trials. If the available records do not
provide enough data, the agent suggests additional tests.

In Figure 1(a), we give a simplified example of eligibility
criteria for a clinical trial. This trial is for young and
middle-aged women with a noninvasive cancer at stage
II or III. When testing a patient’s eligibility, a clinician
has to order three medical tests (Figure 1(b)).

(a) Eligibility criteria

1. The patient is female.

2. She is at most forty-five years old.

3. Her cancer stage is II or III.

4. Her cancer is not invasive.

5. At most three lymph nodes have tumor cells.

6. Either

• the patient has no cardiac arrhythmias, or

• all tumors are smaller than 2.5 centimeters.

(b) Tests and questions

General information
What is the patient’s sex?
What is the patient’s age?

Mammogram, Cost is $150
What is the cancer stage?
Does the patient have invasive cancer?

Biopsy, Cost is $300
What is the cancer stage?
How many lymph nodes have tumor cells?
What is the greatest tumor size?

Electrocardiogram, Cost is $200
Does the patient have cardiac arrhythmias?

Fig. 1. Example of eligibility criteria, tests, and questions.


(a) Acceptance

sex = FEMALE and
age ≤ 45 and
stage ∈ {II, III} and
invasive = NO and
lymph-nodes ≤ 3 and
(arrhythmias = NO or
tumor-size < 2.5)

(b) Rejection

sex = MALE or
age > 45 or
stage ∈ {I, IV} or
invasive = YES or
lymph-nodes > 3 or
(arrhythmias = YES and
tumor-size ≥ 2.5)

Fig. 2. Logical expressions for the criteria in Figure 1(a).

The agent first prompts a clinician to enter the patient’s
sex and age. If the patient satisfies the corresponding
conditions, the agent asks for the mammogram
results and verifies Conditions 3 and 4; then, it
requests the biopsy and electrocardiogram data. If the
patient’s records already include some test results, the
clinician can answer the corresponding questions while
entering the personal data, before the agent selects test
procedures. For example, if the records indicate that
the cancer stage is IV, the clinician can enter the stage
along with sex and age, and then the agent immediately
determines that the patient is ineligible for this trial.

IV.  Knowledge  Base 

The agent’s knowledge base includes questions, medical
tests, and logical expressions that represent eligibility
criteria for each trial. We give a simplified example
of tests and questions in Figure 1(b), and logical
expressions in Figure 2.


(sex = FEMALE and
 age ≤ 45 and
 stage ∈ {II, III} and
 invasive = NO and
 lymph-nodes ≤ 3 and
 arrhythmias = NO)
or
(sex = FEMALE and
 age ≤ 45 and
 stage ∈ {II, III} and
 invasive = NO and
 lymph-nodes ≤ 3 and
 tumor-size < 2.5)

Fig. 3. Disjunctive normal form of the acceptance expression.


The agent supports three types of questions; the first
type takes a yes/no response, the second is multiple
choice, and the third requires a numeric answer. For
example, the cancer stage is a multiple-choice question,
and the tumor size is a numeric question. The description
of a medical test includes the test name, dollar cost,
and list of questions that can be answered based on the
test results (Figure 1).

We encode the eligibility for a clinical trial by a logical
expression that does not have negations, called the
acceptance expression. It includes variables that represent
medical data, as well as equalities, inequalities,
“set-element” relations, conjunctions, and disjunctions
(Figure 2a). In addition, the agent uses the logical
complement of the eligibility criteria, called the rejection
expression, which also does not have negations
(Figure 2b). It describes the conditions that make a patient
ineligible for the trial.

The  agent  collects  data  until  it  can  determine  which  of 
the  two  expressions  is  TRUE.  For  instance,  if  a  patient’s 
sex  is  MALE,  then  the  rejection  expression  in  Figure  2(b) 
is  TRUE,  and  the  agent  immediately  determines  that  this 
trial  is  inappropriate.  If  the  sex  is  FEMALE,  the  agent 
asks  more  questions. 
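
A minimal sketch of this evaluation, assuming a three-valued scheme in which a condition on an unanswered question is "unknown" (`None`). The tuple encoding, the function names, and the reading of "at most" as ≤ are our assumptions for illustration; the paper does not describe the agent's internal representation.

```python
import operator

# Comparison operators allowed in atomic conditions (no negations are needed).
OPS = {
    "==": operator.eq, "<": operator.lt, "<=": operator.le,
    ">": operator.gt, ">=": operator.ge,
    "in": lambda value, allowed: value in allowed,
}


def evaluate(expr, data):
    """Return True, False, or None (unknown) for an expression given partial data."""
    kind = expr[0]
    if kind == "cmp":
        _, var, op, value = expr
        if var not in data:
            return None                      # question not answered yet
        return OPS[op](data[var], value)
    results = [evaluate(sub, data) for sub in expr[1]]
    if kind == "and":
        if any(r is False for r in results):
            return False
        return None if any(r is None for r in results) else True
    if kind == "or":
        if any(r is True for r in results):
            return True
        return None if any(r is None for r in results) else False
    raise ValueError(f"unknown node type: {kind}")


# Expressions for the criteria of Figure 1(a).
ACCEPT = ("and", [
    ("cmp", "sex", "==", "FEMALE"),
    ("cmp", "age", "<=", 45),
    ("cmp", "stage", "in", {"II", "III"}),
    ("cmp", "invasive", "==", "NO"),
    ("cmp", "lymph_nodes", "<=", 3),
    ("or", [("cmp", "arrhythmias", "==", "NO"),
            ("cmp", "tumor_size", "<", 2.5)]),
])
REJECT = ("or", [
    ("cmp", "sex", "==", "MALE"),
    ("cmp", "age", ">", 45),
    ("cmp", "stage", "in", {"I", "IV"}),
    ("cmp", "invasive", "==", "YES"),
    ("cmp", "lymph_nodes", ">", 3),
    ("and", [("cmp", "arrhythmias", "==", "YES"),
             ("cmp", "tumor_size", ">=", 2.5)]),
])


def decide(data):
    """Collect-until-decided rule: stop as soon as one expression is TRUE."""
    if evaluate(ACCEPT, data) is True:
        return "eligible"
    if evaluate(REJECT, data) is True:
        return "ineligible"
    return "undetermined"
```

With only `{"sex": "MALE"}` entered, `decide` already reports the patient as ineligible; with only `{"sex": "FEMALE"}` the result is undetermined and more questions are needed.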

If the knowledge base includes multiple clinical trials,
the agent checks a patient’s eligibility for each of them.
It first asks for the tests related to multiple trials, and
then requests additional tests for specific trials. After
getting each new answer, the agent re-evaluates the
patient’s eligibility for each trial.

V.  Order  of  Tests 

If a patient’s records do not include enough data,
the agent asks for additional tests; for example, if the
records do not provide data for the eligibility criteria in
Figure 1, the agent asks for the mammogram, biopsy,
and electrocardiogram. The total cost of tests may depend
on their order; for instance, if we begin with the
mammogram, and it shows that the cancer stage is IV,
then we can immediately reject the trial in Figure 1 and
avoid the more expensive tests.
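
The effect of test order on total cost can be illustrated with a small simulation. This is a simplified sketch under stated assumptions: it models only early rejection (not early acceptance), it assumes the sex and age conditions are already satisfied, and the patient record and function names are hypothetical. Test names and costs follow Figure 1(b).

```python
from itertools import permutations

# Test name -> (dollar cost, questions answered by the test), from Figure 1(b).
TESTS = {
    "mammogram": (150, ["stage", "invasive"]),
    "biopsy": (300, ["stage", "lymph_nodes", "tumor_size"]),
    "ecg": (200, ["arrhythmias"]),
}


def rejected(known):
    """True once the known answers violate some criterion of Figure 1(a)."""
    if known.get("stage") in ("I", "IV"):
        return True
    if known.get("invasive") == "YES":
        return True
    if known.get("lymph_nodes", 0) > 3:
        return True
    # Both parts must be known and bad for the disjunctive criterion to fail.
    if known.get("arrhythmias") == "YES" and known.get("tumor_size", 0.0) >= 2.5:
        return True
    return False


def cost_of_order(order, patient):
    """Total cost of running tests in the given order, stopping at rejection."""
    known, total = {}, 0
    for test in order:
        cost, questions = TESTS[test]
        total += cost
        known.update({q: patient[q] for q in questions})
        if rejected(known):
            break
    return total


# Hypothetical patient whose cancer stage is IV.
PATIENT = {"stage": "IV", "invasive": "NO", "lymph_nodes": 1,
           "tumor_size": 1.0, "arrhythmias": "NO"}

best = min(permutations(TESTS), key=lambda order: cost_of_order(order, PATIENT))
```

For this patient, starting with the mammogram costs $150, whereas ordering the electrocardiogram and then the biopsy costs $500 before the stage is discovered.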

We have explored heuristics for ordering the tests,
based on the test costs and the structure of acceptance
and rejection expressions. The heuristics use a disjunctive
normal form of these expressions; that is, each
expression must be a disjunction of conjunctions. For
example, the rejection expression in Figure 2(b) is in
disjunctive normal form, whereas the acceptance
expression in Figure 2(a) is not. If the system uses ordering
heuristics, it converts this acceptance expression into
the disjunctive normal form shown in Figure 3.
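
The conversion to disjunctive normal form can be sketched by distributing conjunctions over disjunctions. The nested-tuple encoding and the function name `to_dnf` are assumptions made for this illustration; atomic conditions are kept as plain strings because the expressions contain no negations.

```python
from itertools import product


def to_dnf(expr):
    """Convert an and/or expression into a list of clauses.

    An expression is either a string (atomic condition) or a pair
    (op, args) with op in {"and", "or"}. Each returned clause is a
    list of atomic conditions that must all hold.
    """
    if isinstance(expr, str):
        return [[expr]]
    op, args = expr
    parts = [to_dnf(arg) for arg in args]
    if op == "or":
        # A disjunction simply collects the clauses of its arguments.
        return [clause for part in parts for clause in part]
    if op == "and":
        # A conjunction takes one clause from every argument and merges them.
        return [sum(combo, []) for combo in product(*parts)]
    raise ValueError(f"unknown operator: {op}")


# Acceptance expression of Figure 2(a), reading "at most" as <=.
ACCEPT = ("and", [
    "sex = FEMALE",
    "age <= 45",
    "stage in {II, III}",
    "invasive = NO",
    "lymph-nodes <= 3",
    ("or", ["arrhythmias = NO", "tumor-size < 2.5"]),
])

dnf = to_dnf(ACCEPT)
```

Here `dnf` holds two clauses of six conditions each, differing only in the last condition, which matches the structure of Figure 3.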


The agent chooses the order of tests that reduces their
expected cost. After getting the results of the first test,
it re-evaluates the need for the other tests and revises
their ordering. The choice of the first test is based on
three criteria. The agent scores all required tests according
to these criteria, computes a linear combination
of the three scores for every test, and chooses the test
with the highest score.

1.  Cost  of  the  test.  The  agent  prefers  cheaper  tests. 
For  instance,  it  may  start  with  the  mammogram,  which 
is  cheaper  than  the  other  two  tests  in  Figure  1. 

2.  Number  of  clinical  trials  that  require  the  test. 
When  the  agent  checks  a  patient’s  eligibility  for  several 
trials,  it  prefers  tests  that  provide  data  for  the  largest 
number  of  trials.  For  example,  if  the  electrocardiogram 
gives  data  for  two  different  trials,  the  agent  may  prefer 
it  to  the  mammogram  despite  its  higher  cost. 

3. Number of clauses that include the test results.
The agent prefers the tests that provide data for the
largest number of clauses in the acceptance and rejection
expressions. For example, the mammogram data
affect both clauses of the acceptance expression in
Figure 3 and two clauses of the rejection expression in
Figure 2(b). On the other hand, the electrocardiogram
affects only one clause of the acceptance expression and
one clause of the rejection expression; thus, the agent
should order it after the mammogram.
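
The three criteria above can be combined as in the following sketch. The normalization of each score and the weights are hypothetical, since the paper does not give the coefficients of the linear combination. The clause counts for the mammogram and electrocardiogram follow the text; the biopsy count is our own tally from Figures 2(b) and 3.

```python
def choose_first_test(tests, weights=(1.0, 1.0, 1.0)):
    """Pick the test with the highest weighted score.

    tests maps a test name to (cost, number of trials using the test,
    number of clauses affected). Cheaper tests, tests shared by more
    trials, and tests touching more clauses score higher.
    """
    max_cost = max(cost for cost, _, _ in tests.values())
    max_trials = max(trials for _, trials, _ in tests.values())
    max_clauses = max(clauses for _, _, clauses in tests.values())
    w_cost, w_trials, w_clauses = weights

    def score(name):
        cost, trials, clauses = tests[name]
        return (w_cost * (1 - cost / max_cost)
                + w_trials * trials / max_trials
                + w_clauses * clauses / max_clauses)

    return max(tests, key=score)


# Single-trial example of Figure 1: (cost, trials, clauses affected).
single_trial = {
    "Mammogram": (150, 1, 4),
    "Biopsy": (300, 1, 5),            # clause count is our assumption
    "Electrocardiogram": (200, 1, 2),
}

# Variation in which the electrocardiogram also serves a second trial.
two_trials = dict(single_trial, Electrocardiogram=(200, 2, 2))

first = choose_first_test(single_trial)
first_shared = choose_first_test(two_trials, weights=(1.0, 2.0, 1.0))
```

With equal weights the mammogram is chosen first. When the electrocardiogram serves two trials and the trial-count criterion is weighted more heavily, the electrocardiogram is chosen despite its higher cost, mirroring the example under the second criterion.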

VI.  User  Interface 

The agent includes a web-based interface that allows
clinicians to enter patients’ data through remote
computers; the interface consists of five screens (Figure 4).

The  start  screen  is  for  adding  and  retrieving  patients 
(Figure  5).  After  a  user  enters  a  patient’s  name,  the 
agent  displays  a  list  of  the  available  trials  (Figure  6). 
The  user  can  choose  a  subset  of  these  trials,  and  then 
the  agent  checks  eligibility  only  for  the  selected  trials. 
The  next  screen  is  for  basic  personal  and  medical  data, 
such  as  sex,  age,  and  cancer  stage  (Figure  7). 

After  the  agent  gets  the  basic  data,  it  prompts  the 
user  for  medical  information  related  to  specific  trials 
(Figure  8).  When  the  user  enters  medical  data,  the 
agent  continuously  re-evaluates  the  patient’s  eligibility 
and  shows  the  decision  for  each  trial.  If  the  patient 
is  ineligible  for  some  trials,  the  user  can  find  out  the 
reasons  by  clicking  the  “Why”  button.  The  interface 
also  includes  a  screen  for  the  review  and  modification  of 
the  previous  answers,  similar  to  the  screen  in  Figure  8. 

VII.  Experiments 

We have built a knowledge base for the breast-cancer
clinical trials at the H. Lee Moffitt Cancer Center,
applied the agent to retrospective data from 187 past
patients and 57 current patients, and compared the results
with manual selection by clinicians at the cancer center.

We summarize the results for the past patients in Table I,
and the results for the current patients in Table II.
The “same matches” column includes the number of patients
who have been selected by both human clinicians
and the automated agent. The “new matches” column
gives the number of patients who have been matched
by the agent but potentially missed by human clinicians.
Finally, the last column shows the number of
patients whose available records are incomplete. Clinicians
have found trials for these patients, but the agent
cannot identify these matches because of missing data.
The agent has found a number of matches potentially
missed by human clinicians; thus, it can help to recruit
more patients for clinical trials.


TABLE I

Results of matching 187 past patients.

Clinical   Same      New       Missing
Trial      Matches   Matches   Data
10822      10        5         0
10840      0         19        3
11072      48        26        19
11378      4         19        3
11992      5         6         0
12100      8         20        13
12101      20        30        0


TABLE II

Results of matching 57 current patients.

Clinical   Same      New       Missing
Trial      Matches   Matches   Data
11132      4         1         1
11971      3         0         0
12100      0         2         0
12101      4         21        0
12601      0         1         0
11931      1         8         0
12775      1         4         0

In  Table  III,  we  give  the  mean  test  costs  with  and 
without  the  ordering  heuristics  for  the  187  past  patients. 
The  results  show  that  the  implemented  heuristics  reduce 
the  costs  by  more  than  a  factor  of  two. 

VIII.  Scalability 

The  time  complexity  of  evaluating  the  acceptance  and 
rejection  expressions  is  linear  in  their  size.  Experiments 
on  a  Sun  Ultra  10  have  shown  that  the  evaluation  takes 
about  0.02  seconds  per  question,  and  the  time  is  linear  in 
the  number  of  questions.  Typical  eligibility  conditions 
for  a  clinical  trial  include  ten  to  thirty  questions;  thus, 
the  evaluation  time  is  0.2  to  0.6  seconds  per  trial. 


TABLE III

Cost savings by test reordering.

                 Average Dollar Cost
Clinical Trial   Without Test Reordering   With Test Reordering
10822            $20                       $8
[missing]        $0                        $0
11072            $556                      $194
11378            $34 (only one value survives in the source)
11992            $87                       $34
[missing]        $0                        $0
12101            $24                       $22

Adding patients: add a new patient; find an old patient.
Selecting clinical trials: choose candidate trials; view available trials.
Entering initial data: answer initial questions; change previous answers.
Entering medical data: enter test results; view eligibility decisions.
Revising medical data: view test results; change some results.

Fig. 4. Entering a patient’s data. The web-based interface for data entry consists of five screens. We show these screens by
rectangles and the transitions between them by arrows.


Fig.  6.  Selecting  clinical  trials. 


Fig.  7.  Entering  basic  information  for  a  patient. 


Fig.  8.  Entering  medical  data. 


(a)  Eligibility  criteria 

1.  The  patient  is  female. 

2.  She  is  at  most  forty-five  years  old. 

3.  Either 

•  her  cancer  is  not  invasive,  or 

•  her  cancer  is  not  recurrent. 

4.  Either 

•  at  most  three  lymph  nodes  have  tumor  cells,  or 

•  all  tumors  are  smaller  than  2.5  centimeters. 

5.  Either 

•  the  patient  has  no  cardiac  arrhythmias,  or 

•  the  patient  has  no  congenital  heart  disease. 

(b) Acceptance expression

sex = FEMALE and
age ≤ 45 and
(invasive = NO or recurrent = NO) and
(lymph-nodes ≤ 3 or tumor-size < 2.5) and
(arrhythmias = NO or congenital = NO)

(c) Reduced expression

sex = FEMALE and
age ≤ 45 and
invasive-and-recurrent = NO and
(lymph-nodes ≤ 3 or tumor-size < 2.5) and
arrhythmias-and-congenital = NO


Fig.  9.  Reducing  the  number  of  disjunctions.  The  conversion 
of  the  eligibility  criteria  (a)  into  a  logical  expression  (b) 
leads  to  an  explosion  in  the  size  of  the  corresponding 
disjunctive  normal  form.  We  can  prevent  the  explosion 
by  replacing  some  disjunctions  with  single  questions  (c). 


The  linear  scalability  is  an  important  advantage  over 
Bayesian  systems,  which  do  not  scale  to  a  large  number 
of  clinical  trials  [7,  21,  23].  The  authors  of  these  systems 
have  reported  that  the  sizes  of  the  underlying  networks 
are  superlinear  in  the  number  of  trials  [22,  37],  and  the 
training  time  is  superlinear  in  the  network  size  [24,  34]. 

If the agent uses the cost-reduction heuristics, it converts
the acceptance and rejection expressions into
disjunctive normal form, which can potentially lead to an
explosion in their size. For example, if eligibility
conditions are as shown in Figure 9(a), the agent initially
generates the expression in Figure 9(b). If the agent
converts it to disjunctive normal form, the resulting
expression consists of eight clauses.

Although  the  conversion  may  result  in  unpractically 
large  expressions,  experiments  have  shown  that  this 
problem  does  not  arise  in  practice  because  the  number 
of  nested  disjunctions  is  usually  small.  Furthermore, 
we  can  eliminate  some  disjunctions  by  combining  their 
elements  into  longer  questions.  For  instance,  we  can 
represent  Condition  3  in  Figure  9(a)  by  a  single  ques¬ 
tion:  “Does  the  patient  have  both  invasive  and  recurrent 
cancer?”  If  we  apply  this  modification  to  Conditions  3 
and  5,  then  we  obtain  the  expression  in  Figure  9(c),  and 
its  conversion  to  disjunctive  normal  form  results  in  an 
expression  with  two  clauses. 
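
The clause counts above can be checked with a short sketch. It is not the agent's implementation: the tuple encoding and the `to_dnf` helper are hypothetical, and `FIG_9B` is an assumed reconstruction of Figure 9(b), obtained by splitting the two combined questions of Figure 9(c) back into disjunctions.

```python
from itertools import product

def to_dnf(expr):
    """Return the disjunctive normal form as a list of clauses.

    An expression is either an atom (a string) or a tuple
    ("and", e1, e2, ...) / ("or", e1, e2, ...).
    Each clause is a tuple of atoms that are implicitly conjoined.
    """
    if isinstance(expr, str):
        return [(expr,)]
    op, *args = expr
    parts = [to_dnf(arg) for arg in args]
    if op == "or":
        # A disjunction contributes the clauses of all its branches.
        return [clause for part in parts for clause in part]
    if op == "and":
        # A conjunction takes one clause from every argument.
        return [sum(combo, ()) for combo in product(*parts)]
    raise ValueError(f"unknown operator: {op}")

# Assumed reconstruction of Figure 9(b): three two-way disjunctions.
FIG_9B = ("and", "sex = FEMALE", "age < 45",
          ("or", "invasive = NO", "recurrent = NO"),
          ("or", "lymph-nodes < 3", "tumor-size < 2.5"),
          ("or", "arrhythmias = NO", "congenital = NO"))

# Figure 9(c): two disjunctions replaced by single questions.
FIG_9C = ("and", "sex = FEMALE", "age < 45",
          "invasive-and-recurrent = NO",
          ("or", "lymph-nodes < 3", "tumor-size < 2.5"),
          "arrhythmias-and-congenital = NO")

clauses_b = len(to_dnf(FIG_9B))
clauses_c = len(to_dnf(FIG_9C))
```

Under these assumptions, each two-way disjunction doubles the number of clauses, so three disjunctions give eight clauses and one disjunction gives two.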


IX.  Concluding  Remarks 

We have developed an agent that automatically assigns
patients to clinical trials. We have described the
representation of selection criteria, heuristics for
ordering of tests, and a web-based interface for entering
patients’ data, which will enable physicians across the
country to access a central repository of clinical trials.

Experiments have confirmed that the agent has the potential
to find more participants for clinical trials. They have
also shown that the ordering of medical tests affects their
overall cost, and the implemented heuristics can reduce the
cost of finding trial participants. The heuristics do not
account for the probabilities of possible test results, and
we plan to add probabilistic reasoning as part of the future
work.

Abstract: When medical researchers test a new treatment
procedure, they recruit patients with appropriate medical
histories. An experiment with a new procedure is called a
clinical trial. The selection of patients for clinical
trials has traditionally been a labor-intensive task, which
involves the matching of medical records with a list of
eligibility criteria, and studies have shown that clinicians
can miss up to 60% of the eligible patients. A recent
project at the University of South Florida has been aimed at
the automation of this task. We have developed an
intelligent agent that selects trials for eligible patients.
We report the work on the representation and entry of the
related knowledge about clinical trials. We describe the
structure of the agent’s knowledge base and the interface
for adding new trials.

Keywords: Knowledge representation, medical expert systems,
user interfaces.

I.  Introduction 

Cancer causes 550,000 deaths in the United States every
year, and the treatment of cancer is an active research
area. Medical experts explore new treatment methods, such as
drugs, surgery techniques, and radiation therapies. An
experiment with a new treatment procedure is called a
clinical trial. When researchers conduct a trial, they
recruit patients with an appropriate cancer type and medical
history. The selection of patients has traditionally been a
manual procedure, and studies have shown that clinicians can
miss up to 60% of the eligible patients [12, 22, 30].

A recent project at the University of South Florida has been
aimed at automatic selection of patients for clinical
trials. We have developed an intelligent agent that prompts
a clinician for a patient’s data and identifies all matching
trials [1, 11]. It includes a knowledge base with
information about available clinical trials, criteria for
selecting patients, and related medical tests.

We report the work on a web-based interface that enables a
clinician to enter new trials without the help of a
programmer. We have used the interface to build a knowledge
base for clinical trials at the Moffitt Cancer Center,
located at the University of South Florida. We review the
previous work on medical expert systems (Section II),
explain the knowledge representation in the developed agent
(Section III), and describe the interface for adding new
knowledge (Section IV).

II.  Previous  Work 

Researchers began to work on medical applications of
artificial intelligence in the early seventies. Shortliffe
and his colleagues developed the MYCIN system, which
diagnosed bacterial diseases [5, 25, 26]. Experiments showed
the effectiveness of MYCIN, which led to the development of
other medical systems [5, 14], such as NEOMYCIN, PUFF,
CENTAUR, and VM.

Musen et al. built a rule-based system, called EON, that
selected AIDS patients for clinical trials [17].
Ohno-Machado et al. developed the AIDS2 system, which also
assigned AIDS patients to clinical trials [19]. Bouaud et
al. created a cancer expert system, called ONCODOC, that
suggested alternative trials for each patient and allowed a
physician to choose among them [3, 4]. Seroussi used ONCODOC
to select participants for clinical trials at two hospitals,
which helped to increase the number of selected patients by
a factor of three [23, 24].

Early expert systems did not have knowledge-acquisition
tools, and programmers hand-coded the related rules. To
simplify knowledge entry, researchers implemented
specialized tools for some systems [13, 15].

Eriksson pointed out the need for tools that would allow
efficient knowledge acquisition, and described a system for
building such tools [6]. Tallis et al. developed a library
of scripts for modifying knowledge bases, which helped to
enforce the consistency of the modified knowledge [7, 27,
28, 29]. Kim and Gil considered the use of scripts for
building new knowledge-acquisition tools, and created a
system for evaluating these tools [9, 10]. Blythe et al.
designed a general knowledge-acquisition interface based on
previous techniques [2].

Musen developed the PROTEGE environment for creating
knowledge-acquisition tools [14, 16], which proved effective
for the development of knowledge systems, including the AIDS
expert systems [20], asthma treatment selection [8], and
elevator-design rules [21].

III.  Knowledge  Base 

Physicians at the Moffitt Cancer Center have about 150
clinical trials available for cancer patients. They have
identified criteria that determine a patient’s eligibility
for each trial, and they use these criteria to select trials
for eligible patients. Traditionally, physicians have
selected trials by a manual analysis of patients’ data. A
review of the resulting selections has shown that physicians
usually do not check all clinical trials and occasionally
miss an appropriate trial.

To address this problem, we have built an intelligent agent
that helps to select trials for each patient. It prompts a
clinician to enter the results of medical tests, and uses
them to identify appropriate trials.

In Figure 1(a), we give a simplified example of eligibility
criteria for a clinical trial. This trial is for young and
middle-aged women with a noninvasive cancer at stage II or
III. When testing a patient’s eligibility, a clinician has
to order three medical tests (Figure 1(b)). The agent first
prompts the clinician to enter the patient’s sex and age. If
the patient satisfies the corresponding conditions, the
agent asks for the mammogram results and verifies Conditions
3 and 4; then, it requests the biopsy and electrocardiogram
data.


(a) Eligibility criteria

1. The patient is female.
2. She is at most forty-five years old.
3. Her cancer stage is II or III.
4. Her cancer is not invasive.
5. At most three lymph nodes have tumor cells.
6. Either
   • the patient has no cardiac arrhythmias, or
   • all tumors are smaller than 2.5 centimeters.

(b) Tests and questions

General information
   What is the patient’s sex?
   What is the patient’s age?

Mammogram, Cost is $150
   What is the cancer stage?
   Does the patient have invasive cancer?

Biopsy, Cost is $300
   What is the cancer stage?
   How many lymph nodes have tumor cells?
   What is the greatest tumor diameter?

Electrocardiogram, Cost is $200
   Does the patient have cardiac arrhythmias?

(c) Eligibility expression

sex = FEMALE and
age ≤ 45 and
cancer-stage ∈ {II, III} and
invasive-cancer = NO and
lymph-nodes ≤ 3 and
(arrhythmias = NO or
 tumor-diameter < 2.5)

Fig. 1. Example of eligibility criteria, tests, and questions.


The agent’s knowledge base includes questions, tests, and
logical expressions that represent eligibility for each
trial. We give an example of tests and questions in
Figure 1(b), and a logical expression in Figure 1(c).

The agent supports three types of questions; the first type
takes a yes/no response, the second is multiple choice, and
the third requires a numeric answer. For example, the cancer
stage is a multiple-choice question, and the tumor diameter
is a numeric question. The description of a medical test
includes the test name, dollar cost, and list of questions
that can be answered based on the test results. For
instance, the mammogram in Figure 1 has a cost of $150, and
it allows the answering of two questions. Different tests
may answer the same question; for example, both mammogram
and biopsy show the cancer stage.


We encode the eligibility for a clinical trial by a logical
expression, which may include variables that represent the
available medical data, as well as equalities, inequalities,
“set-element” relations, conjunctions, and disjunctions. For
example, we encode the criteria in Figure 1(a) by the
expression in Figure 1(c).

The agent collects data until it can determine whether the
eligibility expression is TRUE or FALSE. For instance, if a
patient’s sex is MALE, then the expression in Figure 1(c) is
FALSE, and the agent immediately rejects this trial. If the
sex is FEMALE, the agent has to ask more questions. If the
knowledge base includes many clinical trials, the agent
checks a patient’s eligibility for each of them. It first
asks for the tests related to multiple trials, and then
requests additional tests for specific trials.
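
The early decision described above can be sketched with three-valued evaluation, where an unanswered question yields an undetermined result. This is an illustration under assumptions, not the agent's code: the tuple encoding, the `OPS` table, and the `evaluate` function are hypothetical, and the expression reads the “at most” conditions of Figure 1(a) as ≤.

```python
import operator

# Hypothetical operator table for the atomic conditions.
OPS = {
    "=": operator.eq,
    "<": operator.lt,
    "<=": operator.le,
    "in": lambda value, options: value in options,
}

def evaluate(expr, data):
    """Return True, False, or None (undetermined) for a partial record."""
    head = expr[0]
    if head in ("and", "or"):
        results = [evaluate(sub, data) for sub in expr[1:]]
        decisive = (head == "or")   # True decides "or", False decides "and"
        if decisive in results:
            return decisive
        if None in results:
            return None             # some needed answer is still missing
        return not decisive
    variable, op, target = expr     # atomic condition
    if variable not in data:
        return None
    return OPS[op](data[variable], target)

# The expression of Figure 1(c) in the assumed encoding.
ELIGIBILITY = ("and",
    ("sex", "=", "FEMALE"),
    ("age", "<=", 45),
    ("cancer-stage", "in", {"II", "III"}),
    ("invasive-cancer", "=", "NO"),
    ("lymph-nodes", "<=", 3),
    ("or", ("arrhythmias", "=", "NO"), ("tumor-diameter", "<", 2.5)))

rejected = evaluate(ELIGIBILITY, {"sex": "MALE"})
undecided = evaluate(ELIGIBILITY, {"sex": "FEMALE"})
accepted = evaluate(ELIGIBILITY, {
    "sex": "FEMALE", "age": 40, "cancer-stage": "II",
    "invasive-cancer": "NO", "lymph-nodes": 2, "tumor-diameter": 1.8})
```

In the last record the electrocardiogram answer is absent, yet the small tumor diameter already satisfies the disjunction, so the expression is decided without that test.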

IV.  Entering  Eligibility  Criteria 

We have designed a web-based interface for adding new
clinical trials [18], which consists of two main parts; the
first part is for adding information about medical tests
(Figure 2), and the second is for eligibility criteria
(Figure 3). The interface includes ten screens; two of them
are “start screens,” which can be reached from any other
screen. We give an example of entering eligibility criteria,
describe the two parts of the interface, and present
experiments on its effectiveness.

Example: Suppose that a user needs to enter the criteria
shown in Figure 1. First, she utilizes the “Adding tests”
screen to enter the three tests (Figure 4). Then, she adds
the related questions; to enter questions for a specific
test, she selects the test and clicks “Modify” (Figure 4),
and the agent displays the “Modifying a test” screen
(Figure 5). To add a question, she clicks the appropriate
button at the bottom (Figure 5) and then types the question
(Figure 6).

After  adding  the  questions  for  all  tests,  the  user  goes 
to  the  “Adding  clinical  trials”  screen  and  initializes  a 
new  trial  (Figure  7).  She  gets  the  “Selecting  tests” 
screen  and  chooses  the  tests  related  to  the  current  trial 
(Figure  8).  Then,  she  marks  relevant  questions  and  the 
answers  that  make  a  patient  eligible  (Figure  9).  If  the 
eligibility  criteria  include  disjunctions,  she  has  to  use  the 
screen  for  composing  logical  expressions  (Figure  10). 

Tests  and  questions:  The  interface  for  adding  tests 
and  questions  includes  six  screens  (Figure  2).  The  start 
screen  is  for  viewing  the  available  tests  and  defining  new 
ones,  whereas  the  other  screens  are  for  modifying  tests 
and  adding  questions. 

We show the start screen in Figure 4; its left-hand side
allows viewing questions and going to a modification screen.
If the user selects a test and clicks “View,” the agent
shows the questions related to this test. If the user clicks
“Modify,” it displays the “Modifying a test” screen
(Figure 5). The right-hand side of the start screen allows
adding a new test by specifying its name and cost.

The  “Modifying  a  test”  screen  shows  the  information 
about  a  specific  test,  which  includes  the  test  name,  cost, 
and  related  questions.  The  user  can  change  the  test 
name  and  cost;  the  four  bottom  buttons  allow  moving 
to  the  screens  for  adding  and  deleting  questions. 


Fig.  2.  Entering  tests  and  questions.  We  show  the  screens  by  rectangles  and  the  transitions  between  them  by  arrows.  The 
bold  rectangle  is  the  start  screen. 


Fig.  3.  Entering  eligibility  criteria. 

Fig. 4. Adding a new test.


Fig. 5. Modifying a test; the bottom buttons are for moving to question-entry screens.


(b) Multiple-choice question.


Fig. 6. Adding new questions; the user enters a question and answer options.


Fig. 7. Adding a new clinical trial.


Fig. 9. Selecting questions and answers. The user checks the questions for the current clinical trial and marks the answers
that satisfy the eligibility criteria.


Fig. 10. Combining questions into a logical expression.


We show the screens for adding yes/no and multiple-choice
questions in Figure 6; the screen for numeric questions is
similar. The user can enter a new question for the current
test, along with a set of allowed answers. If the question
is also related to other tests, the user has to mark them in
the lower box. The “Deleting questions” screen is for
removing old questions.

Eligibility conditions: The mechanism for entering
eligibility criteria consists of four screens (Figure 3).
The start screen allows the user to initialize a new
clinical trial and view the criteria for old trials. If the
user needs to modify a clinical trial, the agent first
displays the test-selection screen (Figure 8). The user then
chooses related tests and question types, and clicks
“Continue” to get the question list.

The next screen (Figure 9) allows the user to select
specific questions and mark the answers that make a patient
eligible. For a multiple-choice question, the user may
specify several eligibility options; for example, a patient
may be eligible if her cancer stage is II or III. For a
numeric question, the user has to specify a range of values;
for instance, a patient may be eligible if her age is
between 0 and 45 years. If the user clicks “Simple
questions,” the agent generates a conjunction of the
selected criteria. If the eligibility conditions involve a
more complex expression, the user has to click “Combined
question” and then use the screen for composing logical
expressions (Figure 10).


Fig. 11. Entry time for test sets (left) and the mean time per question for each set (right). We plot the average time (dashed
lines) and the time of the fastest and slowest users (vertical bars).


Fig. 12. Entry time for eligibility criteria. We show the average time for each clinical trial and the time per question (dashed
lines), along with the performance of the fastest and slowest users (vertical bars).

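
The conjunction produced by the “Simple questions” option can be pictured with the following sketch. It is only an illustration: the function names, the dictionary of marked answers, and the textual output format are hypothetical and do not describe how the interface stores expressions.

```python
def criterion(variable, answer):
    """Turn one marked question into a textual condition."""
    if isinstance(answer, tuple):
        # Numeric question: the user gives a (low, high) range.
        low, high = answer
        return f"{low} <= {variable} <= {high}"
    if isinstance(answer, (set, frozenset)):
        # Multiple-choice question with several eligible options.
        return f"{variable} in {{{', '.join(sorted(answer))}}}"
    # Yes/no or multiple-choice question with one eligible answer.
    return f"{variable} = {answer}"

def simple_questions(marked):
    """Join the conditions for all marked questions into a conjunction."""
    return " and ".join(criterion(var, ans) for var, ans in marked.items())

marked_answers = {
    "sex": "FEMALE",
    "age": (0, 45),
    "cancer-stage": {"II", "III"},
    "invasive-cancer": "NO",
}
expression = simple_questions(marked_answers)
```

A disjunction such as Condition 6 in Figure 1(a) cannot be expressed this way, which is why the interface provides a separate screen for composing logical expressions.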

Entry time: We ran experiments with sixteen novice users,
who had no prior experience with the interface. First, every
user entered four sets of medical tests; each set included
three tests and ten questions. Then, each user added
eligibility expressions for ten clinical trials used at the
Moffitt Cancer Center; the number of questions in an
eligibility expression varied from ten to thirty-five.

We measured the entry time for each test set and each
eligibility expression. In Figure 11, we show the mean time
for every test set and the time per question for the same
sets. All users entered the test sets in the same order,
from 1 to 4; since they had no prior experience, their
performance improved during the experiment. In Figure 12, we
give similar graphs for the entry of eligibility
expressions.

The experiments have shown that novices can efficiently use
the interface; they quickly learn its full functionality,
and their learning curve flattens after about an hour. The
average time per question is 31 seconds for the entry of
medical tests and 37 seconds for eligibility criteria, which
means that a user can enter all 150 cancer trials used at
Moffitt in about two weeks.
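
The two-week estimate can be checked with a rough calculation. It uses the fifteen to thirty minutes per trial reported in the concluding remarks; the forty-hour working week is our assumption, not a figure from the experiments.

```python
TRIALS = 150                  # cancer trials used at Moffitt
MINUTES_PER_TRIAL = (15, 30)  # entry time per trial, from the concluding remarks
HOURS_PER_WEEK = 40           # assumed length of a working week

def weeks_needed(minutes_per_trial):
    """Convert the total entry time for all trials into working weeks."""
    total_hours = TRIALS * minutes_per_trial / 60
    return total_hours / HOURS_PER_WEEK

low_weeks, high_weeks = (weeks_needed(m) for m in MINUTES_PER_TRIAL)
```

Under these assumptions the total lies between roughly one and two working weeks, which agrees with the estimate above.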


V.  Concluding  Remarks 

We have developed knowledge-acquisition tools for an agent
that automatically assigns cancer patients to clinical
trials. We have described the representation of eligibility
criteria and a web-based interface for adding new trials.
The experiments have shown that a user can enter a new trial
in fifteen to thirty minutes. Novices can use the interface
without prior instructions, and they reach their full speed
after about an hour. Although cancer research at Moffitt has
provided the motivation for this work, the agent is not
limited to cancer, and we can use it for trials related to
other diseases.